Bioinformatics A Practical Guide to Next Generation Sequencing Data Analysis (Hamid D. Ismail)


be the right moment to remove them using the “cutadapt” plugin with “trim-single” or “trim-paired” for single-end or paired-end reads, respectively.

qiime cutadapt trim-single \
  --i-demultiplexed-sequences demux.qza \
  --p-front GTGCCAGCMGCCGCGGTAA \
  --p-error-rate 0 \
  --o-trimmed-sequences trimmed-demux.qza \
  --verbose
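To make the effect of “--p-front” and “--p-error-rate 0” more concrete, the following Python snippet is a minimal sketch of 5' primer trimming with zero mismatches allowed. It is not how cutadapt is implemented; the function names and the three example reads are hypothetical, and the sketch assumes that only full-length primer matches count. The “M” in the primer is the IUPAC code for “A or C”.

```python
# IUPAC nucleotide codes: each primer symbol maps to the bases it may match.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def primer_matches(primer, window):
    """True if every base in `window` is allowed by the primer symbol there."""
    return len(primer) == len(window) and all(
        base in IUPAC[sym] for sym, base in zip(primer, window)
    )

def trim_front(read, primer):
    """Remove the primer and everything before it; return None if not found.

    Assumption: only full-length matches with zero mismatches are accepted,
    which mimics an error rate of 0 but ignores partial matches.
    """
    n = len(primer)
    for start in range(len(read) - n + 1):
        if primer_matches(primer, read[start:start + n]):
            return read[start + n:]
    return None

PRIMER = "GTGCCAGCMGCCGCGGTAA"  # primer from the command above

# Hypothetical reads, made up for illustration only.
read_a = "GTGCCAGCAGCCGCGGTAATACGGAGGGT"    # A at the M position
read_c = "TTGTGCCAGCCGCCGCGGTAATACGGAGGGT"  # two leading bases, C at the M position
read_x = "GTGCCAGCTGCCGCGGTAATACGGAGGGT"    # T at the M position (one mismatch)

print(trim_front(read_a, PRIMER))  # TACGGAGGGT
print(trim_front(read_c, PRIMER))  # TACGGAGGGT
print(trim_front(read_x, PRIMER))  # None
```

The first two reads are trimmed because “M” accepts both A and C, whereas the third read is left unmatched because a single mismatch is already too many at an error rate of 0.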

Having assessed our yoga data with “demux summarize”, we now have a clear view of its quality. Since the data consists of paired-end reads, we will carry out quality control later, together with clustering and denoising.

7.3.4.2 Clustering and Denoising

After the above preprocessing, the next step in the data analysis is to create features by either clustering or denoising, as discussed above. QIIME2 supports de novo, closed-reference, and open-reference clustering using the VSEARCH plugin, and denoising with DADA2 and deblur. Either approach produces a feature table and representative sequences. Denoising (with DADA2 or deblur) attempts to remove the noise generated by sequencing errors. The features generated by DADA2 and deblur are also called amplicon sequence variants (ASVs). Whether you continue with clustering or with denoising is your choice. These techniques were discussed above in detail. In the following, we will show you how to perform clustering and denoising with QIIME2.

7.3.4.2.1 Clustering

If your plan is to cluster reads into OTUs without denoising, QIIME2 provides the “q2-vsearch” plugin to do just that. This plugin has methods for the three types of clustering: de novo, closed-reference, and open-reference. The “q2-vsearch” plugin can also perform quality control, so you may need to preprocess the data before running clustering. Paired-end reads must be merged before clustering. In the following, we will walk you through the steps of clustering to the point of generating feature tables and OTU representative sequences.
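The difference between the three types of clustering can be illustrated with a toy Python sketch. This is only an illustration under strong assumptions, not the VSEARCH algorithm: identity is computed position by position on equal-length sequences (real tools align the sequences), de novo clustering uses one common greedy strategy (most abundant sequences become centroids first), and all function names, sequences, and counts are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions; a toy measure for equal-length sequences."""
    if len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cluster_de_novo(seq_counts, threshold=0.97):
    """Greedy clustering without a reference: visit sequences by decreasing
    abundance; each joins the first centroid it matches, else founds a cluster."""
    table = {}
    for seq, count in sorted(seq_counts.items(), key=lambda kv: -kv[1]):
        for centroid in table:
            if identity(seq, centroid) >= threshold:
                table[centroid] += count
                break
        else:
            table[seq] = count
    return table

def cluster_closed_reference(seq_counts, reference, threshold=0.97):
    """Assign sequences to reference sequences only; return (table, unmatched)."""
    table, unmatched = {}, {}
    for seq, count in seq_counts.items():
        hit = next((r for r in reference if identity(seq, r) >= threshold), None)
        if hit is None:
            unmatched[seq] = count
        else:
            table[hit] = table.get(hit, 0) + count
    return table, unmatched

def cluster_open_reference(seq_counts, reference, threshold=0.97):
    """Closed-reference first, then de novo clustering of what did not match."""
    table, unmatched = cluster_closed_reference(seq_counts, reference, threshold)
    for centroid, count in cluster_de_novo(unmatched, threshold).items():
        table[centroid] = table.get(centroid, 0) + count
    return table

# Hypothetical dereplicated reads (sequence -> count) and a one-entry reference.
reads = {
    "ACGTACGTAC": 50,
    "ACGTACGTAA": 5,   # one mismatch from the first sequence
    "TTTTGGGGCC": 20,
    "TTTTGGGGCA": 2,   # one mismatch from the third sequence
}
reference = ["ACGTACGTAC"]

print(cluster_de_novo(reads, 0.9))
print(cluster_closed_reference(reads, reference, 0.9))
print(cluster_open_reference(reads, reference, 0.9))
```

With a threshold of 0.9, de novo and open-reference clustering both yield two OTUs with counts 55 and 22, while closed-reference clustering keeps only the OTU found in the reference (count 55) and leaves the other two sequences unmatched.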

7.3.4.2.1.1 Merging Paired-End Reads

If the data consists of paired-end reads, the forward and reverse reads must be merged before clustering. Merging is done with the “join-pairs” method of the “q2-vsearch” plugin.
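Conceptually, merging means reverse-complementing the reverse read and joining it to the forward read where the two overlap. The Python snippet below is a minimal sketch of that idea, assuming an exact overlap with no mismatches and ignoring quality scores; VSEARCH itself scores an alignment and tolerates mismatches. The function names, the “min_overlap” parameter, and the example amplicon are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def merge_pair(forward, reverse, min_overlap=10):
    """Merge a read pair on the longest exact overlap between the end of the
    forward read and the start of the reverse-complemented reverse read.
    Returns the merged sequence, or None if the overlap is too short."""
    rc = reverse_complement(reverse)
    for size in range(min(len(forward), len(rc)), min_overlap - 1, -1):
        if forward[-size:] == rc[:size]:
            return forward + rc[size:]
    return None

# Hypothetical 30-base amplicon sequenced from both ends with 20-base reads,
# so the two reads overlap by 10 bases in the middle.
amplicon = "ACGTTGCAAGGCTTACCGGATTCAGGTCAA"
forward = amplicon[:20]
reverse = reverse_complement(amplicon[10:])

merged = merge_pair(forward, reverse)
print(merged == amplicon)                          # True
print(merge_pair("AAAAAAAAAAAA", "CCCCCCCCCCCC"))  # None: no overlap
```

Pairs that do not overlap sufficiently cannot be merged, which is why the read length relative to the amplicon length matters for paired-end data.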

The artifact “demux-yoga.qza” of our example data is in the “inputs” directory. Since the reads are paired-end, we merge them before clustering. The following script takes the “demux-yoga.qza” artifact as input, joins the forward and reverse reads, and creates a new artifact for the merged reads, “demux-yoga-merged.qza”:

qiime vsearch join-pairs \
  --i-demultiplexed-seqs inputs/demux-yoga.qza \